A noise-robust system for NIST 2012 speaker recognition evaluation

Authors

  • Luciana Ferrer
  • Mitchell McLaren
  • Nicolas Scheffer
  • Yun Lei
  • Martin Graciarena
  • Vikramjit Mitra
Abstract

The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges, including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI’s submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.

Similar resources

UTD-CRSS Systems for 2012 NIST Speaker Recognition Evaluation (The CRSS SRE Team)

This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) for the 2012 NIST Speaker Recognition Evaluation. We developed a state-of-the-art i-vector based speaker recognition system [1]. Probabilistic linear discriminant analysis (PLDA) [2] along with several other backends are used for channel/noise compens...

Quality measures based calibration with duration and noise dependency for speaker recognition

This paper studies the effect of short utterances and noise on the performance of automatic speaker recognition. We focus on calibration aspects, and propose a calibration strategy that uses quality measures to model the calibration parameters. We carry out the proposed calibration by using simple Quality Measure Functions (QMFs) of duration and measured signal-to-noise ratio from speech segmen...

The I3a speaker recognition system for NIST SRE12: post-evaluation analysis

The I3A submission for the recent NIST 2012 speaker recognition evaluation (SRE) was based on the i-vector approach with a multi-channel PLDA classifier. This PLDA is modified so that, for each i-vector, the between-class covariance depends on the type of channel where the segment was recorded (telephone, interviews, clean, noisy, etc.). In this paper, we present the description of our submission ...

i-Vector Transformation Using a Novel Discriminative Denoising Autoencoder for Noise-Robust Speaker Recognition

This paper proposes i-vector transformations using neural networks for achieving noise-robust speaker recognition. A novel discriminative denoising autoencoder (DDAE) is employed on i-vectors to remove additive noise effects. The DDAE is trained to denoise and classify noisy i-vectors simultaneously, making it possible to add discriminability to the denoised i-vectors. Speaker recognition exper...

Performance factor analysis for the 2012 NIST speaker recognition evaluation

The 2012 NIST Speaker Recognition Evaluation, held in the autumn of 2012, was designed to examine a variety of factors affecting the performance of automatic systems for speaker recognition. Here we examine, for leading systems included in this evaluation, the observed effects on performance of five such factors: the inclusion in test segment speech of environmental noise or of added synthetic ...

Publication date: 2013